 specialized hardware


The Synthesis of XNOR Recurrent Neural Networks with Stochastic Logic

Neural Information Processing Systems

XNOR networks seek to reduce the model size and computational cost of neural networks so that they can be deployed on specialized hardware that must run in real time with limited resources. In XNOR networks, both weights and activations are binary, which greatly benefits specialized hardware by replacing expensive multiplications with simple XNOR operations. Although XNOR convolutional and fully-connected neural networks have been successfully developed during the past few years, there is no XNOR network implementing commonly used variants of recurrent neural networks such as long short-term memories (LSTMs). The main computational core of LSTMs involves vector-matrix multiplications followed by a set of non-linear functions and element-wise multiplications to obtain the gate activations and state vectors, respectively. Several previous attempts at quantizing LSTMs focused only on the vector-matrix multiplications while retaining the element-wise multiplications in full precision. In this paper, we propose a method that converts all the multiplications in LSTMs to XNOR operations using stochastic computing. To this end, we introduce a weighted finite-state machine and its synthesis method to approximate the non-linear functions used in LSTMs on stochastic bit streams. Experimental results show that the proposed XNOR LSTMs reduce the computational complexity of their quantized counterparts by a factor of 86x without sacrificing latency, while achieving better accuracy across various temporal tasks.
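The idea of replacing a multiplication with an XNOR on stochastic bit streams can be illustrated with a minimal Python sketch. It assumes the standard bipolar stochastic-computing encoding, in which a value x in [-1, 1] is represented by a stream whose bits are 1 with probability (x + 1) / 2. The function names, stream length, and seed are hypothetical choices made for illustration, and the sketch does not reproduce the paper's weighted finite-state machine or its LSTM design.

```python
import random


def to_bipolar_stream(x, length, rng):
    """Encode x in [-1, 1] as bits with P(bit = 1) = (x + 1) / 2."""
    p = (x + 1.0) / 2.0
    return [1 if rng.random() < p else 0 for _ in range(length)]


def from_bipolar_stream(bits):
    """Decode a bipolar stream back to an estimate in [-1, 1]."""
    return 2.0 * sum(bits) / len(bits) - 1.0


def xnor_stream(a_bits, b_bits):
    """Bitwise XNOR: the output bit is 1 exactly when the input bits agree."""
    return [1 - (a ^ b) for a, b in zip(a_bits, b_bits)]


def sc_multiply(x, y, length=4096, seed=0):
    """Estimate x * y by XNOR-ing two independently generated streams.

    With independent streams, P(out = 1) = pa*pb + (1 - pa)*(1 - pb),
    which decodes to (2*pa - 1) * (2*pb - 1) = x * y.
    """
    rng = random.Random(seed)  # hypothetical fixed seed, for repeatability
    a_bits = to_bipolar_stream(x, length, rng)
    b_bits = to_bipolar_stream(y, length, rng)
    return from_bipolar_stream(xnor_stream(a_bits, b_bits))


if __name__ == "__main__":
    print(sc_multiply(0.5, -0.5))  # close to -0.25, up to sampling noise
```

The estimate is only approximate: its error shrinks as the stream length grows, which is the usual accuracy-versus-length trade-off of stochastic computing.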


Simultaneous Triggering and Synchronization of Sensors and Onboard Computers

arXiv.org Artificial Intelligence

High-fidelity estimation algorithms for robotics require accurate data. However, timestamping of sensor data is a key issue that rarely receives the attention it deserves. Inaccurate timestamping can be compensated for in post-processing, but accurate timestamping is imperative for online estimation. Timing issues can also be mitigated online by relaxing the tuning parameters from their optimal values, but at a detriment to performance. To address the need for real-time, low-cost timestamping, a versatile system that uses readily available components and established synchronization methods is introduced. The system's synchronization and triggering capabilities, for both high- and low-rate sensors, are demonstrated.


Towards Kinetic Manipulation of the Latent Space

arXiv.org Artificial Intelligence

The latent space of many generative models is rich in unexplored valleys and mountains. The majority of tools used for exploring it are so far limited to Graphical User Interfaces (GUIs). While specialized hardware can be used for this task, we show that simple feature extraction with pre-trained Convolutional Neural Networks (CNNs) from a live RGB camera feed does a very good job of manipulating the latent space through simple changes in the scene, with vast room for improvement.


LLMs in Political Science: Heralding a New Era of Visual Analysis

arXiv.org Artificial Intelligence

Interest is increasing among political scientists in leveraging the extensive information available in images. However, the challenge of interpreting these images lies in the need for specialized knowledge in computer vision and access to specialized hardware. As a result, image analysis has been limited to a relatively small group within the political science community. This landscape could potentially change thanks to the rise of large language models (LLMs). This paper aims to raise awareness of the feasibility of using Gemini for image content analysis. A retrospective analysis was conducted on a corpus of 688 images. Content reports were elicited from Gemini for each image and then manually evaluated by the authors. We find that Gemini is highly accurate in performing object detection, which is arguably the most common and fundamental task in image analysis for political scientists. Equally important, we show that it is easy to implement as the entire command consists of a single prompt in natural language; it is fast to run and should meet the time budget of most researchers; and it is free to use and does not require any specialized hardware. In addition, we illustrate how political scientists can leverage Gemini for other image understanding tasks, including face identification, sentiment analysis, and caption generation. Our findings suggest that Gemini and other similar LLMs have the potential to drastically stimulate and accelerate image research in political science and social sciences more broadly.


Characterization of Locality in Spin States and Forced Moves for Optimizations

arXiv.org Artificial Intelligence

Ising formulations are widely used to solve combinatorial optimization problems, and a variety of quantum or semiconductor-based hardware has recently been made available. In combinatorial optimization problems, the existence of local minima in energy landscapes hinders the search for the global minimum. We note that the aim of the optimization is not to obtain exact samples from the Boltzmann distribution, and there is thus no need to satisfy detailed balance conditions. In light of this fact, we develop an algorithm that escapes local minima efficiently, although it does not yield exact samples. For this purpose, we utilize a feature that characterizes locality in the current state, which is easy to obtain with a type of specialized hardware. Furthermore, as the proposed algorithm is based on a rejection-free algorithm, the computational cost is low. In this work, after presenting the details of the proposed algorithm, we report the results of numerical experiments that demonstrate the effectiveness of the proposed feature and algorithm.
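For readers unfamiliar with the term, one generic rejection-free update for an Ising model can be sketched in Python as follows. This is not the paper's forced-move algorithm and does not use its locality feature. It assumes an energy of the form E = -sum over i<j of J[i][j]*s_i*s_j - sum over i of h[i]*s_i with a symmetric coupling matrix, and Metropolis-style flip weights. All names are hypothetical.

```python
import math
import random


def delta_energy(spins, J, h, i):
    """Energy change caused by flipping spin i (J assumed symmetric)."""
    local_field = h[i] + sum(
        J[i][j] * spins[j] for j in range(len(spins)) if j != i
    )
    return 2.0 * spins[i] * local_field


def flip_weight(d_e, temperature):
    """Metropolis-style weight min(1, exp(-dE / T)), guarded against overflow."""
    return 1.0 if d_e <= 0.0 else math.exp(-d_e / temperature)


def rejection_free_step(spins, J, h, temperature, rng):
    """Flip exactly one spin, chosen with probability proportional to its weight.

    No proposal is ever rejected, so every call changes the state.
    Note: rng.choices raises ValueError if every weight underflows to zero.
    """
    n = len(spins)
    weights = [
        flip_weight(delta_energy(spins, J, h, i), temperature) for i in range(n)
    ]
    i = rng.choices(range(n), weights=weights, k=1)[0]
    spins[i] = -spins[i]
    return i


if __name__ == "__main__":
    rng = random.Random(0)
    J = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]  # toy ferromagnetic couplings
    h = [0.0, 0.0, 0.0]
    spins = [1, 1, -1]
    flipped = rejection_free_step(spins, J, h, temperature=1.0, rng=rng)
    print(flipped, spins)
```

Computing every flip weight costs more per step than a single Metropolis proposal, but no step is wasted on a rejected move, which matters most at low temperatures where rejections dominate.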


Computers that power self-driving cars could be a huge driver of global carbon emissions

#artificialintelligence

In the future, the energy needed to run the powerful computers on board a global fleet of autonomous vehicles could generate as many greenhouse gas emissions as all the data centers in the world today. That is one key finding of a new study from MIT researchers that explored the potential energy consumption and related carbon emissions if autonomous vehicles are widely adopted. The data centers that house the physical computing infrastructure used for running applications are widely known for their large carbon footprint: They currently account for about 0.3 percent of global greenhouse gas emissions, or about as much carbon as the country of Argentina produces annually, according to the International Energy Agency. Realizing that less attention has been paid to the potential footprint of autonomous vehicles, the MIT researchers built a statistical model to study the problem. They determined that 1 billion autonomous vehicles, each driving for one hour per day with a computer consuming 840 watts, would consume enough energy to generate about the same amount of emissions as data centers currently do.
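The scenario's energy figure can be checked with a few lines of Python. The sketch uses only the three numbers quoted above (1 billion vehicles, 840 watts, one hour per day) and assumes a 365-day year. It covers only the energy drawn by the onboard computers and does not attempt to convert energy into emissions, since no emission factor is given here. The function name is hypothetical.

```python
def fleet_compute_energy(vehicles, power_watts, hours_per_day, days_per_year=365):
    """Return (daily energy in GWh, annual energy in TWh) for onboard computers."""
    daily_wh = vehicles * power_watts * hours_per_day  # watt-hours per day
    daily_gwh = daily_wh / 1e9
    annual_twh = daily_wh * days_per_year / 1e12
    return daily_gwh, annual_twh


if __name__ == "__main__":
    daily_gwh, annual_twh = fleet_compute_energy(1_000_000_000, 840, 1)
    print(f"{daily_gwh:.0f} GWh per day, {annual_twh:.1f} TWh per year")
```

Under these assumptions the fleet's computers would draw 840 GWh per day, or roughly 306.6 TWh per year.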


What is AI hardware? How GPUs and TPUs give artificial intelligence algorithms a boost

#artificialintelligence

Most computers and algorithms -- including, at this point, many artificial intelligence (AI) applications -- run on general-purpose circuits called central processing units or CPUs. Though, when some calculations are done often, computer scientists and electrical engineers design special circuits that can perform the same work faster or with more accuracy. Now that AI algorithms are becoming so common and essential, specialized circuits or chips are becoming more and more common and essential.


Getting Rid of the Deep Learning Silo in the Data Center

#artificialintelligence

With an electrical engineering education from Purdue University (PhD) and the Indian Institute of Technology Bombay (BS, MS), and nearly 25 years of experience in the semiconductor, systems, and hyperscale service provider industries, he has a broad perspective on hardware design and deployment. The elasticity of cloud infrastructure is a key enabler for enterprises and internet services, creating a shared pool of compute resources that various tenants can draw from as their workloads ebb and flow. Cloud tenants are spared the details of capacity and supply planning. This worked well because modern server systems are very efficient at a multitude of general computing tasks. Deep learning, however, creates new complexities for this model.


Iris

#artificialintelligence

A wide range of real-world applications, including computational photography (glint reflection) and augmented reality effects (virtual avatars), rely on accurately tracking the iris within an eye. This is a challenging task to solve on mobile devices, due to the limited computing resources, variable light conditions and the presence of occlusions, such as hair or people squinting. Iris tracking can also be utilized to determine the metric distance of the camera to the user. This can improve a variety of use cases, ranging from virtual try-on of properly sized glasses and hats to accessibility features that adapt the font size depending on the viewer's distance. Often, sophisticated specialized hardware is employed to compute the metric distance, limiting the range of devices on which the solution could be applied.